Capture-C: A modular and flexible approach for high-resolution chromosome conformation capture

Chromosome conformation capture (3C) methods measure the spatial proximity between DNA elements in the cell nucleus. Many methods have been developed to sample 3C material, including the Capture-C family of protocols. Capture-C methods use oligonucleotides to enrich for interactions of interest from sequencing-ready 3C libraries. This approach is modular and has been adapted and optimised to work for sampling of disperse DNA elements (NuTi Capture-C), including from low-cell inputs (LI Capture-C), as well as to generate Hi-C like maps for specific regions of interest (Tiled-C) and to interrogate multi-way interactions (Tri-C). We present the design, experimental protocol and analysis pipeline for NuTi Capture-C in addition to the variations for generation of LI Capture-C, Tiled-C and Tri-C data. The entire procedure can be performed in three weeks and requires standard molecular biology skills and equipment, access to a next-generation sequencing platform, and basic bioinformatic skills. Implemented with other sequencing technologies, these methods can be used to identify regulatory interactions and to compare the structural organisation of the genome in different cell types and genetic models.


Introduction
Development of Capture-C methods
Chromosome conformation capture (3C) is a powerful method to measure the proximity of DNA elements within the three-dimensional confines of the nucleus 1 . All 3C methods follow a general principle of chromatin digestion and re-ligation, with minimal disruption of nuclear structure achieved either by fixation or careful buffering to maintain native conditions 2 . Chimeric ligation junctions are then assayed, with more frequent ligation between two distal fragments being a proxy for greater proximity. Originally, chimeric junctions were assayed directly in a low throughput manner using PCR with specifically targeted primer pairs 1 . The application of next-generation sequencing to assay ligation junctions has allowed high-throughput sampling of interactions in all-versus-all approaches, most commonly in situ Hi-C at relatively low resolution 3 , and many-versus-all approaches at high resolution, commonly with the Capture-C 4,5 or 4C-seq 6 methods.
The first Capture-C 4 method was established as a many-versus-all approach that used RNA oligonucleotide pull-down of restriction fragments of interest from in situ 3C material.
Subsequent sequencing allowed detection of interacting fragments in an unbiased manner. This approach was later applied to Hi-C libraries to develop Capture Hi-C (CHi-C), often called Promoter Capture Hi-C 7 , and most recently dubbed Enhancer Capture Hi-C 8 . Capture-C was improved by the application of biotinylated single-stranded DNA (ssDNA) oligonucleotides for sequential "double capture" of indexed and multiplexed 3C libraries 5 . This improved method, Next Generation (NG) Capture-C, achieved 30-50% on-target sequencing efficiency, with 10,000-100,000+ unique reporters per viewpoint. Most importantly, as 3C libraries used in Capture-C methods are indexed after sonication, PCR duplicates can be distinguished and excluded from analysis. The Capture-C approach can be divided into three distinct modules: 3C library generation, indexing, and enrichment (Fig. 1a). By careful optimisation of the library generation and indexing steps, the cell requirement was reduced from >1M cells to as few as 10,000 cells for Low-Input Capture-C (LI Capture-C) 9 . Subsequent work to reduce protocol inefficiencies from the in situ researchers have preferred 3C-qPCR 44 , which was thought to be more quantitative, but it does not achieve many-versus-all data as primers need to be designed and optimised for all fragment pairs of interest, resulting in extremely low-resolution profiles. Improvements to the sequencing-based 3C approaches, NG Capture-C 5 and UMI-4C 45 , provided greater depth of signal and allowed PCR duplicates to be filtered by use of unique sonication ends (this was possible with the original Capture-C as well), overcoming previous limitations and allowing high-throughput analysis at tens to hundreds of targets, with NG Capture-C providing the greater number of unique reporters per viewpoint 46 . The application of many-versus-all experiments to thousands of targets was initially limited to CHi-C 7,47 .
By performing pull-down in Hi-C libraries, 3C material is enriched for successful ligation junctions, at the expense of library complexity due to the relative inefficiency of the molecular steps required to generate Hi-C libraries. Excluding these inefficient steps allows NuTi Capture-C to generate up to 1,000-fold greater depth of signal than Capture Hi-C 46 . CHi-C experiments generally target in the region of 20,000 promoters or enhancers, but use infrequent-cutting enzymes and few replicates, resulting in low-resolution data (1-10 kb resolution with 100-1,000 interactions per viewpoint 46 ). However, with careful design and optimisation, NuTi Capture-C has been applied to ~8,000 active erythroid promoters at high resolution in triplicate 10 , indicating that genome-scale experiments are no longer limited to lower-resolution approaches.
3C resolution can be increased by using deoxyribonuclease (DNase I) or micrococcal nuclease (MNase) to digest chromatin 48-50 instead of restriction endonucleases, as these enzymes have no specific cutting motif. MNase-digested 3C libraries were initially used in all-versus-all approaches 48 . Recently, we have reported a new approach in which MNase digestion is combined with a targeted enrichment method, similar to NuTi Capture-C. Micro Capture-C (MCC) provides super-high-resolution 3C data for selected viewpoints, and even permits the footprinting of transcription factor binding at promoters and enhancers 51 . Careful optimisation of micrococcal nuclease levels is needed to achieve super-high-resolution data. Therefore, MCC requires tens of millions of cells and is currently not easily applied to low-abundance primary cell populations, in contrast to traditional Capture-C methods, though this will undoubtedly change as the MCC protocol is refined and optimised.
The Capture-C, Capture Hi-C, and MCC methods use defined sequence-specific oligonucleotides for enrichment. Other many-versus-all approaches use 3C combined with immunoprecipitation of proteins (ChIA-PET 52 , PLAC-seq 53 , ChIA-DROP 54 , Hi-ChIP 55 ) or RNA (Hi-ChIRP 56 ) to achieve enrichment. These methods enticingly allow the simultaneous identification of protein binding sites or enhancers and their interactions. In reality, the results are difficult to interpret because they are prone to bias caused by enrichment: they generally over-report contacts between sites enriched for the targeted molecule and other similarly enriched sites. Mathematical and experimental quantification of the bias induced between two simultaneously enriched distant sites (i.e. co-targeting) shows that it is incredibly difficult to correct accurately 10 ; as such, no correction method is generally applied in these hybrid technologies. In comparison, the defined nature of oligonucleotide pull-down in Capture-C, Capture Hi-C and MCC experiments allows the exclusion of biased fragments from analyses, providing more robust and interpretable findings.
Contiguous viewpoint targeting (many-versus-many)-Tiled-C was designed to combine the ability of all-versus-all 3C methods such as Hi-C to map large-scale chromatin structures including TADs, and the ability of one/many-versus-all 3C methods such as NuTi Capture-C to identify enhancer-promoter interactions within TADs at high resolution. While NuTi Capture-C targets disperse individual restriction fragments as viewpoints, Tiled-C uses a panel of capture oligonucleotides tiled across all contiguous restriction fragments within specified genomic regions. This allows for efficient enrichment for interactions within this region and thus for deep, targeted sequencing of these chromatin interactions. Although co-targeting of distal fragments induces enrichment bias 10 , the contiguous nature of Tiled-C designs avoids this bias as targeted fragments are not enriched more than other fragments within the targeted region. Compared to Hi-C, Tiled-C can create high-resolution contact matrices of regions of interest at great depth in multiplexed samples for a fraction of the sequencing costs associated with genome-wide high-resolution Hi-C experiments. Other approaches which allow for many-versus-many analysis within regions of interest include Chromosome Conformation Capture Carbon Copy (5C) 57 , Targeted Chromatin Capture (T2C) 58 , Capture Hi-C (cHi-C) 59 , HYbrid Capture Hi-C (Hi-C 2 ) 60 , and Tiled-MCC 61 . An important advantage of Tiled-C compared to these methods is that it allows for high-quality data generation from as few as 2,000 cells 13 .
Single-allele multiway analyses-Most 3C methods, including Capture-C, 4C and Hi-C, focus on the analysis of pairwise interactions in cell populations. These methods therefore do not provide information about the higher-order assembly of chromatin structures and their dynamics in individual cells. The long ligation products in 3C libraries contain many ligation junctions between multiple DNA elements in a concatemer. These elements were in close proximity in the cell nucleus at the time of fixation. Therefore, analysis of sequencing reads with multiple junctions allows for the investigation of multi-way chromatin interactions between DNA elements in individual nuclei. Tri-C was developed to identify such multi-way interactions with viewpoints of interest with high sensitivity and at high resolution 11 . By using a restriction enzyme to create small restriction fragments at the viewpoints of interest − usually NlaIII − and creating longer sonication fragments, multiple interacting fragments can be analyzed efficiently using high-quality Illumina sequencing. Compared to other recently developed approaches to detect multi-way chromatin interactions, such as chromosomal walks, three-way 4C 62 and multi-contact 4C (MC-4C) 63,64 , Tri-C offers advantages in throughput, sensitivity, and resolution, as well as careful quantification of interaction frequencies due to robust PCR duplicate filtering 11 . Other recent innovative techniques, such as genome architecture mapping (GAM) and Multiplex-GAM 65,66 , split-pool recognition of interactions by tag extension (SPRITE) 67 , DNA seqFISH+ 68 , and single-cell Hi-C 69-74 , also allow for investigation of chromosomal organization in single cells. Since the resolution of these techniques at the moment is limited, these methods have predominantly contributed to our understanding of chromosomal structures in single cells at relatively large-scale, rather than at the level of individual regulatory DNA elements.

Experimental design
Enzyme Selection for Resolution-While theoretically any restriction enzyme could be used in 3C, only a few enzymes efficiently digest chromatin, especially when it is heavily crosslinked. The choice of restriction enzyme for generation of 3C material is the largest determinant of experimental resolution; Capture-C libraries use the 4-base cutters (NlaIII, DpnII) which cut approximately 16-times more frequently than 6-base cutters (HindIII).
Whilst the higher resolution provided by 4-base cutters allows for distinguishing interactions of nearby elements, deeper levels of sequencing are required to deal with the more complex sequencing libraries. Generally, it is best to perform experiments at high-resolution and select the restriction enzyme based on its cut sites at targets of interest. Since interactions are detected as newly formed ligation junctions between the ends of restriction fragments, the enzyme should be selected based on the proximity of cut sites to the element of interest (< 2 kb linear distance) and the ability to design effective probes for regions of interest. The targeted fragment should be either overlapping with or very close (<2 kb) to the genomic element of interest and be large enough to accommodate binding of enrichment oligonucleotides (70-120 bp), but not so large that probes are a long way from the element of interest. NuTi Capture-C with a single oligonucleotide per viewpoint is possible; however, this results in lower data depth than with two oligonucleotides − one targeting each end of the fragment. While still providing informative profiles, fragments shorter than 250 bp have been shown to have higher levels of trans interactions than longer fragments within the same 3C library 10 , therefore optimal fragment length is 250-1,000 bp. The sequence underlying the oligonucleotides is also an important consideration. Duplication or high sequence similarity (determined using BLAT 78 and RepeatMasker 79 with Capsequm2) of oligonucleotide sites will result in off-target pull-down. For some loci (e.g. the alpha and beta globin genes) interaction profiles are still easily interpretable despite duplication, whereas for other genes (e.g. glycophorin encoding genes) high sequence similarity results in data which are harder to interpret; a limitation which is common to most sequencing based 3C methods. 
Oligonucleotides likely to result in off-target pull-down can generally be avoided by selecting an adjacent fragment, or changing restriction enzyme. 20,000 unique reporters for NuTi Capture-C 10 . A single MiSeq run generating 20M paired-end reads should therefore provide sufficient sequencing coverage for 5-25 viewpoints in six 3C libraries. Some analytical tools for calling interactions, such as peaky 80 and peakC 81 , also benefit from having numerous viewpoints, as this allows for generation of an accurate background model of non-specific polymer interactions.
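The fragment-selection criteria described in this section (frequent-cutting enzymes, an optimal fragment length of 250-1,000 bp, and a fragment overlapping or within 2 kb of the element of interest) can be illustrated with a minimal in silico digest. This is only a sketch and is not part of Capsequm2 or Oligo: the function names are hypothetical, the sequence is split at the start of each recognition site rather than at the true cut position, and repeat or BLAT filtering is not attempted.

```python
import re

# Recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {"DpnII": "GATC", "NlaIII": "CATG", "HindIII": "AAGCTT"}


def expected_cut_interval(site: str) -> int:
    """Expected spacing (bp) between sites in random sequence with equal base use."""
    return 4 ** len(site)


def digest(sequence: str, site: str) -> list[tuple[int, int]]:
    """Return (start, end) fragments, splitting at the start of every site.

    Simplification: the real cut position within the site differs by enzyme.
    """
    cuts = [m.start() for m in re.finditer(site, sequence.upper())]
    bounds = [0] + [c for c in cuts if c > 0] + [len(sequence)]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]


def candidate_fragments(fragments, element_pos, min_len=250, max_len=1000,
                        max_dist=2000):
    """Keep fragments of optimal length that overlap or lie near the element."""
    keep = []
    for start, end in fragments:
        length = end - start
        if start <= element_pos < end:
            distance = 0  # fragment overlaps the element of interest
        else:
            distance = min(abs(element_pos - start), abs(element_pos - (end - 1)))
        if min_len <= length <= max_len and distance < max_dist:
            keep.append((start, end))
    return keep
```

For example, `expected_cut_interval("AAGCTT") // expected_cut_interval("GATC")` gives the 16-fold difference in cutting frequency between 6-base and 4-base cutters mentioned above.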
For targeting of specific disperse elements (NuTi Capture-C, Tri-C), it is important not to simultaneously enrich at two sites whose direct interactions you are interested in, for example co-targeting of a promoter and its cognate enhancers (Fig. 2b), or targeting two promoters which may interact. As no 3C enrichment method is 100% efficient, co-targeting is a significant source of bias which results in increased observed interaction between targeted sites; see Downes et al. (2021) 10 for an experimental and mathematical description of this phenomenon. To avoid this bias, separate enrichments can be performed on aliquots of the same 3C material, targeting, for example, only enhancers and only promoters.
Comparative Samples-Capture-C enrichment can be performed on multiplexed samples in a single tube. This approach minimises the technical variation in enrichment, generating highly reproducible profiles for statistical analyses. All of the Capture-C methods are usually performed in triplicate (at least) and can therefore be used to compare different genetic models, or cell types in a single experiment. By performing experiments with triplicates, simple statistical tests (e.g. Student's t-test) can be used to compare interactions with specific regions, or more advanced approaches (e.g. DESeq2 82 ) can be used across entire domains of interaction. 3C interaction profiles from highly related cell-types or throughout differentiation can be remarkably similar. Therefore, it is often beneficial to compare samples with a highly unrelated cell type where elements of interest (e.g. enhancers or promoters) are inactive. It is important to note that there can be considerable technical variability between different cell types in the 3C procedure, which can result in differing levels of background noise (i.e. trans interactions) across cell types. Care should be taken to ensure comparative samples have similar noise levels. This can partially be controlled for by normalisation of interaction counts in cis rather than to total interactions, as different levels of trans interactions can alter observed proximal interaction frequencies 10 after normalisation.
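The normalisation in cis described above can be sketched as follows. This is a minimal illustration, not the implementation used by CapCruncher or any other package: the function name, the input layout (a dictionary keyed by fragment coordinates) and the scaling constant of 100,000 are all assumptions made for the example.

```python
def normalise_in_cis(counts, viewpoint_chrom, scale=100_000):
    """Scale reporter counts so that cis reporters sum to `scale`.

    counts: dict mapping (chrom, start, end) -> unique reporter count.
    viewpoint_chrom: chromosome carrying the viewpoint (defines cis).
    scale: arbitrary constant chosen for this sketch.
    """
    cis_total = sum(
        n for (chrom, _start, _end), n in counts.items()
        if chrom == viewpoint_chrom
    )
    if cis_total == 0:
        raise ValueError("no cis reporters found for this viewpoint")
    # Trans counts are rescaled by the same factor but do not affect it,
    # so differing trans noise between samples leaves cis profiles comparable.
    return {frag: n * scale / cis_total for frag, n in counts.items()}
```

Because the scaling factor depends only on cis reporters, two samples with identical cis profiles but different levels of trans background yield identical normalised cis values.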
Tri-C design considerations-Tri-C viewpoints should be located on small (~150-250 bp) restriction fragments generated by the restriction enzyme used for chromatin digestion, which is usually NlaIII, since it has a smaller median fragment size compared to DpnII 11 .
The ~120 bp capture oligonucleotides should be designed to the middle of the restriction fragments on which the viewpoints of interest are located and repetitive sequences should be avoided.
Tiled-C design considerations-Similar considerations as for NuTi Capture-C apply to the design of capture oligonucleotides for Tiled-C. Probes for adjacent restriction fragments in regions of interest can be designed and filtered for repetitive sequences using Oligo 13 [https://oligo.readthedocs.io/en/latest/index.html]. When determining the extent of regions of interest captured it is useful to use low-resolution Hi-C as a guide for the existence and location of regulatory domains and their boundaries. Both the 3D Genome Browser 83 [http://3dgenome.fsm.northwestern.edu/view.php] and HiGlass 84 [https://higlass.io/] provide rich resources of easily accessible Hi-C data in a range of cell-types for this purpose. It is best to be generous in extending the tiled region beyond predicted boundaries for regions of interest to provide an informative regulatory context (Fig. 2b).

Data analysis
Multiple software packages exist for processing of Capture-C sequencing files. Reads from NuTi Capture-C and LI Capture-C experiments are compatible with HiC-Pro 85 . We find that application of peaky, after processing with either CaptureSee 10 or CHiCAGO 90 , gives highly specific interaction calls.
To facilitate consistent data processing, analysis and interpretation of NuTi Capture-C, Tri-C and Tiled-C data, we developed a computational tool called CapCruncher 91 [https://github.com/sims-lab/CapCruncher/releases] to analyse all three experiment types. This pipeline utilises Python and is both easy to install and run. CapCruncher processes raw fastq files, removes PCR duplicates, identifies reporter reads, and generates a UCSC Genome Browser hub with depth-normalised tracks for individual replicates and for the mean of replicates. When multiple samples are provided simultaneously, CapCruncher also generates comparative tracks by subtracting sample means. For Tri-C and Tiled-C, CapCruncher generates visualisation matrices over targeted regions. The CapCruncher pipeline is available on GitHub and Bioconda, and, for testing, a small NuTi Capture-C test dataset can be found on the Gene Expression Omnibus database (GSE129378) [https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129378]. It should be noted that CapCruncher is under continuous development to add and enhance functionalities. We recommend that users read the corresponding manual [https://capcruncher.readthedocs.io/en/latest/] provided online (the current manual is provided as Supplementary Manual). Below we describe the basic requirements and implementation of CapCruncher.

Expertise needed to implement the protocol
The experimental processes associated with Capture-C methods are common modern molecular techniques, including: restriction enzyme digestion, proximity ligation, phenol-chloroform DNA extraction, quantitative and standard PCR, gel electrophoresis, next-generation sequencing library preparation (including AMPure XP SPRI bead clean-ups), streptavidin bead pull-down/washes and next-generation sequencing. The equipment for all of these processes (perhaps with the exception of a sequencing platform) should be readily available in most research institutes, but where possible, alternatives are suggested. Following sequencing, analysis with CapCruncher requires basic Unix command-line skills, which can be easily learnt. However, system administrator rights may be required to install tools, and advanced bioinformatics skills will aid in more complex analyses, such as interaction calling with peaky and other packages.

Configuration files
Three files are required for successful implementation of the Capture-C methods. For designing oligonucleotide probes, a bed format file (chromosome, start, stop, name) giving single base-pair coordinates to sites of interest is required for Capsequm2. For running of CapCruncher, a bed format file specifying the enriched regions (single fragments for NuTi Capture-C and Tri-C, and extended regions for Tiled-C), and a configuration file specifying the genome, mapping parameters, experimental method, and output directories are required. Examples of all three files are included as Supplementary Data 1-3.
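As an illustration of the four-column bed layout described above (chromosome, start, stop, name), the following sketch writes and parses a single-base site record. The helper names, the example coordinate and the site name are hypothetical, and the sketch assumes the standard 0-based, half-open BED convention; the coordinate convention expected by Capsequm2 should be checked against its own documentation and the Supplementary Data examples.

```python
def single_base_bed_line(chrom: str, position: int, name: str) -> str:
    """Format a single base-pair site as a tab-separated four-column BED line.

    Assumes 0-based, half-open coordinates, so one base spans
    [position, position + 1).
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return f"{chrom}\t{position}\t{position + 1}\t{name}"


def parse_bed_line(line: str) -> tuple[str, int, int, str]:
    """Parse the first four columns of a BED line and check the interval."""
    chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
    start, end = int(start), int(end)
    if end <= start:
        raise ValueError(f"invalid interval for {name}: {start}-{end}")
    return chrom, start, end, name


# Hypothetical site used purely for illustration.
example_line = single_base_bed_line("chr16", 1000, "site_A")
```

The same four-column layout applies to the bed file of enriched regions supplied to CapCruncher, except that each interval then spans a whole restriction fragment (NuTi Capture-C, Tri-C) or an extended region (Tiled-C) rather than a single base.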

Limitations
Due to the extremely high efficiency of on-target sequencing afforded by oligonucleotide pull-down, no selection is performed for successful digestion events (unlike Hi-C). Although excluding these relatively inefficient steps reduces the number of cells required for high-resolution data, it does mean that quality control for a high-efficiency digestion is paramount to ensure sequence reads are not wasted. Quality control can be performed either by agarose gel, or more accurately with quantitative real-time PCR (Box 1). Using both methods is optimal. Based on analysis with the latter, 3C libraries should have a minimum of 70% digestion for use.
Capture-C methods provide a temporal, population-based snapshot of active chromatin folding processes. To develop a more granular or dynamic perspective of interactions, Capture-C can be complemented with imaging approaches, particularly high-resolution FISH or live-cell imaging 92-97 . The requirement for single-cell suspensions also limits the application of Capture-C methods, since they are not suitable for complex tissues where mixed cell-types cannot be easily separated into pure populations, or for formalin-fixed paraffin embedded (FFPE) samples, such as biopsies and tissue-sections. In these cases, GAM 65,66 provides a superior ability to separate cell-types of interest and determine interaction dynamics.

• Cells. Capture-C methods are possible in any eukaryotic species or cell-type where a single-cell suspension containing as few as 10,000-20,000 cells can be generated 9,13 . However, if available, using >100,000 will result in data of higher depth and resolution. Successful experiments have been performed previously in fly and chicken, as well as in numerous mouse cell types including embryonic stem cells (

Triton-X, 20% vol/vol
Mix 2 mL of Triton-X with 8 mL of PCR grade water. Store at RT long-term.

Ethanol, 70% vol/vol
Mix 7 mL of absolute ethanol with 3 mL of PCR grade water. Store at RT.

Ethanol, 80% vol/vol
Prepare fresh on day of use. Mix 8 mL of absolute ethanol with 2 mL of PCR grade water.

Procedure
CRITICAL The following protocol describes the generation of a single Nuclear 3C library (for use in any Capture-C method) with either DpnII or NlaIII, followed by indexing, with appropriate information for Tri-C and low-input sample modifications. Prior to oligonucleotide pull-down, uniquely indexed 3C libraries can be pooled for multiplexed capture. The volumes in this section describe a six library experiment (i.e. triplicates for two cell-types/genetic models) but can be scaled as necessary. Oligonucleotide pull-down can be carried out with either ssDNA oligonucleotides (first described for NG Capture-C 5 ) or with dsDNA oligonucleotides (first described for Tiled-C 13 ) and descriptions for both protocols are provided. A host of tools are available to analyse Capture-C experiments. Instructions are provided for processing of replicate samples with a portable python script, CapCruncher, which can process all three experiment types.

Viewpoint Preparation
Oligonucleotide Probe Design
Timing 3 h
1. Use Capsequm2, Oligo or an equivalent tool to design appropriate probes for NuTi Capture-C, Tiled-C or Tri-C (see Experimental Design and Fig. 1). CRITICAL STEP LI Capture-C, NG Capture-C, NuTi Capture-C, and Tri-C have traditionally been performed with ssDNA oligonucleotides, whereas Tiled-C has been performed with dsDNA oligonucleotides. However, there is no reason why a specific method could not be performed with either ssDNA or dsDNA oligonucleotides; therefore, both protocols are described. Follow the appropriate instructions for enrichment using either ssDNA oligonucleotides (steps 75-124) or dsDNA oligonucleotides (steps 125-162).

Oligonucleotide Stock Preparation
Timing 1 h
6. Reconstitute individual or pools of oligonucleotides following the manufacturer's instructions or to a stock concentration so that each unique oligonucleotide is stored at ≥1 μM.
7. If oligonucleotides were ordered individually, generate pools of oligonucleotides by mixing in an exact 1:1 stoichiometric ratio and store at -20°C until required at step 81 (ssDNA Probes) or step 129 (dsDNA Probes).
Step 1 95°C 20 s

Library Indexing
CRITICAL Sequencing adaptors are added by ligation after sonication. Where sonication is not possible, tagmentation can be used for indexing 9,100 ; however, custom blocking oligonucleotides may be required for capture.

Sonication
Timing 2 h
50. Bring 235 μL of AMPure XP SPRI beads to room temperature in a microcentrifuge tube (set aside).
57. Air dry SPRI beads at room temperature on magnetic stand until matt in appearance. CRITICAL STEP Take care not to over-dry the beads as this will result in increased DNA losses; beads will look damp but not glossy when they are ready, whereas over-dried beads will develop cracks.
58. Remove from magnetic stand and re-suspend beads in 55 μL of PCR-grade water.
59. Incubate at room temperature for 2 min to elute. Replace on magnetic stand and once clear (~2 min) recover 53 μL.
60. Assess 1 μL of sonicated material using D1000 TapeStation (Fig. 3b).
67. Perform an SPRI bead clean-up as described at steps 53-59 with 180 μL of AMPure XP SPRI beads. Elute in 59 μL of PCR-grade water and recover 28.5 μL into two PCR tubes.
70. Mix by pipetting and amplify DNA using the settings below for a total of six cycles of amplification.

PCR Addition of Indices
Step 1    98°C    30 s
Step 2    98°C    10 s
Step 3    65°C    30 s
Step 4    72°C    30 s
Step 5    Go to Step 2
Step 6    72°C    5 min
Step 7    4°C     Hold
71. Combine PCR reactions and perform an AMPure XP SPRI bead clean-up as described at steps 53-59 using 180 μL of AMPure XP SPRI beads. Elute in 55 μL of PCR-grade water and recover 53 μL into a new microcentrifuge tube.
72. Assess 1 μL of indexed material using D1000 TapeStation to ensure increase in fragment size (Fig. 3b).
112. Aliquot 50 μL of PCR mix into each of six PCR tubes and perform PCR using the following settings with a total of 10-14 cycles of amplification.

Quantify 2 μL of indexed library using
Step 1    98°C    45 s
Step 2    98°C    15 s
Step 3    60°C    30 s
Step 4    72°C    30 s
Step 5    Go to Step 2
Step 6    72°C    60 s
Step
148. Briefly centrifuge to collect the material to the bottom of tube and place on magnetic stand. When solution is clear (1 min) discard the entire supernatant without disturbing the pellet.
149. Repeat heated Wash buffer 2 wash (steps 147-148) two times for a total of three washes.
150. After the third and final wash, collect residual buffer with a low volume pipette. Proceed immediately to the next step and do not allow the beads to dry.
151. Remove from the magnetic stand and resuspend in 90 μL of PCR-grade water (45 μL per reaction). Store on ice in preparation for PCR amplification.

Thaw KAPA HiFi HotStart ReadyMix and Amplification Primers on ice and mix.
154. To the streptavidin bead bound DNA add 100 μL of KAPA HiFi HotStart ReadyMix (50 μL per hybridisation reaction) and 10 μL of Amplification Primers (5 μL per hybridisation reaction) and mix by pipetting.
155. Aliquot 50 μL of PCR mix into each of four PCR tubes (two per hybridisation reaction) and perform PCR with a total of 10-14 cycles of amplification.
Step 1    98°C    45 s
Step 2    98°C    15 s
Step 3    60°C    30 s
Step 4    72°C    30 s
Step 5    Go to Step 2
Step 6    72°C    60 s
Step 7    4°C     Hold
156. Pool four reactions in a microcentrifuge tube and place on a magnetic stand.
157. When clear (30 s), transfer supernatant to a new microcentrifuge tube containing 360 μL of AMPure XP SPRI beads (180 μL per hybridisation reaction) and perform bead clean-up as per steps 53-59. Elute into 56 μL of PCR-grade water and recover 53 μL.
158. OPTIONAL Confirm size of amplified DNA using a high sensitivity D1000 TapeStation.
159. Use 2 μL of amplified material in a Qubit dsDNA HS assay kit to quantify the DNA.
? TROUBLESHOOTING

Double Capture (dsDNA Probes)
Timing 2 d
CRITICAL When using optimally titrated probes, double capture increases the on-target sequencing efficiency by 2-3 fold over single capture. The amount of DNA recovered after single capture is generally <2 μg so capture is performed as described for a single library using all of the recovered material. For Tiled-C, the high density of probes leads to an extremely high efficiency enrichment and a second capture is not required. If performing Tiled-C proceed to Sequencing and Analysis (step 163).
160. Perform Hybridisation (steps 125-137) as described using volumes for a single reaction.
161. Perform Streptavidin Bead Binding (steps 138-151) as described using volumes for a single reaction.
162. Perform PCR Amplification (steps 152-159) as described using volumes for a single reaction.
PAUSE POINT Captured DNA may be stored at -20°C for several months until sequencing (step 163).

Sequencing and Analysis
Sequencing
Timing 2 d
163. Using the measured DNA concentration, make a 10 nM dilution of amplified captured DNA from Step 124 or 162.
164. Perform accurate library quantification of the 10 nM dilution using quantitative PCR with size correction. We recommend using KAPA Library Quantification Kit with 1:10,000 and 1:20,000 dilutions.
165. Dilute DNA to appropriate concentration for sequencing (generally 4 nM) and sequence with paired-end reads. Libraries should be sequenced to a depth of 1-5×10^5 reads per viewpoint per sample for NuTi Capture-C, 1-10×10^6 reads per viewpoint per sample for Tri-C, and 3-5×10^6 reads per Mb per sample for Tiled-C, which is sufficient for 5 kb resolution. CRITICAL STEP Using long reads (150 bp) allows the reconstruction of sequencing fragments. From these fragments it is possible to detect restriction digestion sites and in silico digest the chimeric reads generated by 3C. This step is essential for Tri-C experiments where multi-way interactions are detected, but not for Tiled-C and NuTi Capture-C where using shorter reads (40-75 bp) can reduce sequencing costs.
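The arithmetic behind steps 163-165 can be sketched as below. This is an illustrative helper, not part of the published protocol or of CapCruncher: the function names are hypothetical, the conversion assumes an average mass of about 660 g/mol per base pair of double-stranded DNA, and the mean fragment length must be supplied by the user (for example from a TapeStation trace). Accurate quantification should still rely on the qPCR-based measurement in step 164.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA (assumption)


def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def stock_volume_for_dilution(stock_nM: float, target_nM: float,
                              final_volume_ul: float) -> float:
    """Volume of stock (uL) needed for a dilution, from C1*V1 = C2*V2."""
    if target_nM > stock_nM:
        raise ValueError("stock is more dilute than the requested target")
    return target_nM * final_volume_ul / stock_nM


def reads_required(units: float, n_samples: int, reads_per_unit: float) -> int:
    """Total reads for an experiment.

    units: number of viewpoints (NuTi Capture-C, Tri-C) or Mb tiled (Tiled-C).
    reads_per_unit: depth per viewpoint (or per Mb) per sample, taken from
    the ranges quoted in step 165.
    """
    return int(units * n_samples * reads_per_unit)
```

As a usage example, `reads_required(25, 6, 1e5)` gives the total for 25 NuTi Capture-C viewpoints in six libraries at the lower recommended depth, and `reads_required(2, 6, 3e6)` gives the total for a 2 Mb Tiled-C region in six libraries.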

CapCruncher analysis
Timing ~1 d. Will vary depending on viewpoint number and sequencing depth.
CRITICAL In this section, we provide a step-by-step description of how to use the CapCruncher 91 pipeline using triplicate many-versus-all capture of the HBA1, HBA2, HBB, HBD, MYC and SLC25A37 genes in human erythroid and ES cells 10 (GSE129378).
Installation (step 166) needs only be implemented once. In this walk-through we assume that a Conda environment on a Linux operating system is in use. Full descriptions for using the software can be found on the GitHub page (https://github.com/sims-lab/CapCruncher/). Modifications may be required in the commands below when using different operating systems. Key differences for analysing Tiled-C and Tri-C data are highlighted; please refer to the software manuals and relevant GitHub pages for full documentation. Commands starting with '>' are executed in the command line.
The key metrics of a Capture-C experiment are the alignment filtering statistics, where capture efficiency and reporter content are measured (Fig. 5b). Titration of capture oligonucleotides will result in 80-98% of mapped reads containing a target capture fragment. Lower percentages may indicate that probes were not used at the correct concentration, hybridisation conditions/buffers were not optimal, or that off-target capture was a significant factor. Of the capture-containing fragments, 60-80% should also contain a reporter. A portion of the reads filtered out at this step are contained in the capture-adjacent fragment, arising from religation of DNA into its original conformation. Unlike Hi-C, the Capture-C methods do not perform enrichment for successful digestion and ligation events. Therefore, unflashed capture-containing fragments may lack a restriction enzyme site, which occurs when a sonicated fragment is entirely contained within the viewpoint restriction fragment, or contains a ligation junction with its adjacent restriction fragment. Poor digestion efficiency of a 3C library will significantly increase the proportion of these fragments, lowering the informative proportion of reads. Reporter statistics provide the per-viewpoint count of reporters in both cis and trans (Fig. 5c). High-quality 3C libraries and capture provide over 100,000 cis reporters per viewpoint, which should make up >60% of all reporters; however, the cis/trans ratio is variable amongst viewpoints and can be affected by nuclear positions and fragment length. It is important to note that outlier viewpoints that have many more trans interactions than other viewpoints in the same experiment may have mismapping issues; care should be taken when interpreting results for these viewpoints.
Although the method can generate 3C profiles with over 100,000 reporters, interpretable profiles can be generated from replicates with a few thousand reporters, as long as a high-quality 3C library is used.
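The guide values above can be applied programmatically when screening many viewpoints. The following Python sketch is illustrative only: the function and argument names are hypothetical, it does not use or reproduce any CapCruncher interface, and the cut-offs are simply the guide values quoted in the text.

```python
def qc_flags(mapped_reads, capture_reads, capture_with_reporter,
             cis_reporters, trans_reporters):
    """Return a list of warnings for one viewpoint (hypothetical helper).

    All arguments are read or reporter counts taken from the filtering
    and reporter statistics; the thresholds are the guide values from the text.
    """
    if mapped_reads <= 0 or capture_reads <= 0:
        raise ValueError("read counts must be positive")
    total_reporters = cis_reporters + trans_reporters
    if total_reporters <= 0:
        raise ValueError("at least one reporter is required")

    capture_fraction = capture_reads / mapped_reads            # expected 0.80-0.98
    reporter_fraction = capture_with_reporter / capture_reads  # expected 0.60-0.80
    cis_fraction = cis_reporters / total_reporters             # expected > 0.60

    flags = []
    if capture_fraction < 0.80:
        flags.append("low capture efficiency (<80% of mapped reads)")
    if reporter_fraction < 0.60:
        flags.append("low reporter content (<60% of capture-containing reads)")
    if cis_fraction <= 0.60:
        flags.append("low cis fraction (<=60% of reporters)")
    if cis_reporters < 100_000:
        # Not necessarily a failure: a few thousand reporters can still be
        # interpretable if the 3C library is of high quality.
        flags.append("fewer than 100,000 cis reporters")
    return flags


# Hypothetical counts for a well-performing and a poorly performing viewpoint.
good = qc_flags(1_000_000, 900_000, 630_000, 150_000, 50_000)
poor = qc_flags(1_000_000, 500_000, 200_000, 4_000, 6_000)
```

A viewpoint that returns an empty list meets all four guide values; a flagged viewpoint is a prompt to inspect the library and capture conditions rather than an automatic exclusion.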

Troubleshooting
Troubleshooting advice can be found in Table 3.

Box 1 3C Quality Control
The success of Capture-C methods relies upon the generation of high-quality 3C material with a high degree of digestion. To ensure the quality of 3C material, two controls are prepared (steps 17 and 25). Control 1 contains undigested material and confirms that DNA was not degraded prior to digestion. Control 2 contains digested, but un-ligated, chromatin. These two controls, along with the ligated 3C library, are assessed both qualitatively, using gel electrophoresis, and quantitatively, using real-time PCR.

Qualitative Analysis
To visually inspect 3C library and control DNA, separate the samples on a 1% (wt/vol) agarose gel run at moderate current (~70 mA), or use a Genomic DNA ScreenTape in a TapeStation instrument (Fig. 2a). Because of its low DNA requirements, a TapeStation is preferable when working with low-input samples 9 . Control 1 should contain a single band of high-molecular-weight DNA that is not degraded. Control 2 should contain a smear of low-molecular-weight fragments. Low digestion efficiency can be associated with a faint band of high-molecular-weight DNA. The ligated 3C library should have increased in molecular weight due to concatenation, and resemble Control 1. Although complete ligation will result in a greater number of informative junctions being formed, libraries with partial ligation can still be used for Capture-C. Where DNA is limiting, most commonly when working with low cell numbers, it is possible to assess both controls and 3C material using a Genomic DNA ScreenTape in a TapeStation, looking for a similar pattern of DNA distribution for each of the three samples. Qualitative assessment using TapeStation profiles with small amounts of DNA can also be used to ensure that indexing reactions proceed as expected (Fig. 2b).

Quantitative Analysis
Real-time PCR is performed with primers that amplify either across a restriction cut site (Cut-site Primers) or within a restriction fragment (Fragment Primers) (Fig. 3c, Table 1). Fragment Primers will amplify to the same extent in both Control 1 and Control 2, providing a loading control for quantitative PCR (Fig. 3c). The Cut-site Primers will readily amplify in Control 1 but, due to digestion, have reduced amplification in Control 2. This difference in amplification allows the quantification of digestion efficiency (Table 2). We recommend that libraries have a digestion efficiency of at least 70% for use. When working with low-input samples, it is possible to determine cutting efficiency using the re-ligated 3C library and a genomic DNA control. However, this will result in a lower calculated digestion efficiency due to re-ligation of DNA into its original configuration.
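Table 2 gives the calculation used for digestion efficiency. As a minimal sketch of the arithmetic, the Python function below assumes a standard ΔΔCt calculation in which product doubles every cycle (100% amplification efficiency), with the Fragment Primers acting as the loading control. The function name and the Ct values are hypothetical, and Table 2 takes precedence wherever it differs.

```python
def digestion_efficiency(ct_cut_control1, ct_frag_control1,
                         ct_cut_control2, ct_frag_control2):
    """Estimate percent digestion from real-time PCR Ct values.

    Assumption (not taken from Table 2): a delta-delta-Ct calculation with
    perfect doubling per cycle. Control 1 is undigested, Control 2 is
    digested but un-ligated.
    """
    # Normalise the Cut-site Primers to the Fragment Primers in each sample.
    delta_undigested = ct_cut_control1 - ct_frag_control1
    delta_digested = ct_cut_control2 - ct_frag_control2
    # Extra cycles needed to amplify across the cut site after digestion.
    delta_delta_ct = delta_digested - delta_undigested
    # Fraction of cut sites still intact is 2^-ddCt; the rest were digested.
    return 100.0 * (1.0 - 2.0 ** (-delta_delta_ct))


# Hypothetical Ct values: the cut-site product appears 3 cycles later in
# Control 2, so one eighth of sites remain uncut.
efficiency = digestion_efficiency(24.0, 24.0, 27.0, 24.0)
passes_threshold = efficiency >= 70.0  # recommended minimum from the text
```

Under these assumptions, each additional cycle of delay for the Cut-site Primers halves the estimated fraction of uncut sites, so a delay of about 1.74 cycles corresponds to the recommended 70% minimum.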

Modifications for Increased Specificity
The high resolution and depth of signal achieved by Capture-C are due to its ability to specifically sequence target fragments, resulting in high numbers of unique reporters per viewpoint 46 . Two additive protocol steps provide this high-specificity sequencing. The first adaptation, named double capture, uses repeated enrichment to achieve a 160-fold increase in target sequence over single capture, generating 30-50% on-target sequence (Fig. 4). The second adaptation uses titration of oligonucleotides to reduce non-specific enrichment, generating 30-40% on-target sequence following a single capture 10 . When the two methods are combined, up to 98% of mapped read pairs contain the target fragment.
As the number of probes varies between capture designs, a specific probe concentration must be calculated for each capture. The optimal concentration for capture with a single oligonucleotide is ~2.9 nM. For a pool of oligonucleotides, the DNA concentration is simply scaled by the number of unique oligonucleotides.

Required Pool Concentration = 2.9 nM × Number of Probes
It is important to note that this value was determined using mammalian cells; when working with other organisms, a rule-of-thumb adjustment may be appropriate. For example, the Drosophila genome is roughly 15 times smaller than the human genome, so a target fragment will be 15 times more common when capture is performed on the same mass of indexed 3C library. Scaling the amount of probe (15-fold higher concentration) or DNA (15-fold lower amount) to reflect this will increase the efficiency of capture. However, higher amounts of DNA are likely to generate better results because of the increased complexity of the library.
Pooled probe stocks are generated by combining equimolar amounts of probes at 1 μM (or a value greater than the calculated Required Pool Concentration). The concentration of this pooled stock is 1 μM, which can then be diluted to the calculated Required Pool Concentration as needed.
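The pool calculation and dilution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the genome adjustment implements only the option of scaling probe concentration (not the alternative of reducing DNA input), and the probe number and volumes are example values.

```python
SINGLE_PROBE_NM = 2.9  # optimal concentration for one oligonucleotide (from the text)


def required_pool_nm(n_probes, genome_fold_smaller=1.0):
    """Required Pool Concentration in nM.

    genome_fold_smaller is a rule-of-thumb factor: 1.0 for mammalian
    genomes, ~15 for Drosophila if probe concentration is scaled up.
    """
    if n_probes < 1 or genome_fold_smaller <= 0:
        raise ValueError("n_probes must be >= 1 and the scale factor positive")
    return SINGLE_PROBE_NM * n_probes * genome_fold_smaller


def stock_volume_ul(stock_nm, target_nm, final_volume_ul):
    """Volume of pooled stock to dilute to final_volume_ul at target_nm."""
    if target_nm > stock_nm:
        # The stock must be more concentrated than the target; prepare the
        # pool from probes at a higher concentration instead.
        raise ValueError("pooled stock is too dilute for this target")
    return final_volume_ul * target_nm / stock_nm


# Example: 100 probes, mammalian genome, 1 uM (1000 nM) pooled stock.
target = required_pool_nm(100)                   # about 290 nM
volume = stock_volume_ul(1000.0, target, 100.0)  # about 29 uL of stock per 100 uL
```

With these figures, a 1 μM pooled stock is sufficient for designs of up to roughly 340 probes in mammalian cells; larger designs need a more concentrated stock, which is why the text allows for a stock above the calculated Required Pool Concentration.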
Although double capture is essential for high-specificity sequencing when targeting dispersed elements (NuTi Capture-C, Tri-C), it provides little benefit to contiguous designs (Tiled-C). For Tiled-C, the combination of titrated probes and enrichment at both sides of interaction junctions provides highly specific sequencing: 80-90% on target following single capture 13 .

Figure legend. a, The Capture-C family of methods involves three distinct modules. In the first module, a nuclear 3C library is generated from 2% formaldehyde-fixed cells that are lysed, then permeabilised with SDS, and digested with a frequent 4-base cutter (DpnII or NlaIII). Proximity ligation re-arranges the genome order to reflect spatial 3D organisation. Finally, for this module, centrifugation is used to separate DNA from ruptured nuclei from DNA in intact nuclei, which contains more informative 3C material. Library indexing in module 2 is performed using standard next-generation sequencing kits, with sonication providing unique ends for PCR duplicate filtering. For Tri-C, gentler sonication is used to generate longer fragments that contain multiple ligation junctions. The third module is the most diverse, with a unique oligonucleotide design for each method. NuTi Capture-C uses a pair of oligonucleotides from the same strand of DNA that overlap restriction digestion sites of disperse fragments. For Tiled-C the same approach is used; however, contiguous fragments are targeted and double-stranded oligonucleotides have typically been used. In Tri-C, a single oligonucleotide in the centre of a short restriction fragment enriches for sonication fragments with multiple ligation junctions. b, Schematic of results for a hypothetical locus, with one gene (red) and two enhancers (purple circles). NuTi Capture-C, or the low-cell variation LI Capture-C, from the promoter can be used to show direct interactions with both enhancers; Tiled-C produces a Hi-C-like interaction map showing that the three elements are in a TAD-like regulatory domain; and Tri-C shows that the two enhancers can be found simultaneously interacting with each other and the promoter at single alleles.

Table 3 Troubleshooting table.

Step: Reagent Setup
Problem: Excess lysis buffer
Possible reason: Small number of samples.
Solution: To make smaller volumes of lysis buffer, one cOmplete Protease Inhibitor Cocktail tablet can be dissolved in 2 mL of PCR-grade water to generate a 25× stock. This can be aliquoted and stored at -20°C for several months.

Step: 9
Problem: Fewer than 5 × 10^6 cells
Possible reason: Working with a rare cell population or a limited number of cells following cell sorting.
Solution: For fixation, PBS wash and lysis, the volumes can be scaled down to accommodate fewer cells (down to 2 × 10^4 cells). Maintain cells at ~1 × 10^6 cells per 1 mL of growth media, except for ≤1 × 10^6 cells, where 1 mL of media should be used and fixation and lysis performed in a 1.5 mL tube. Perform digestion reactions in 200 μL for between 2 × 10^4 and 5 × 10^6 cells.

Step: 9
Problem: More than 5 × 10^6 cells
Possible reason: Working with a cell line.
Solution: For fixation, PBS wash and lysis, the volumes can be scaled up to accommodate more cells. Maintain cells at ~1 × 10^6 cells per 1 mL of growth media. For greater numbers of cells, perform multiple, parallel digestions and combine material in 300 μL of TE buffer after nuclear isolation.

For low-input samples (≤150,000 cells), where very little DNA is available for controls, digestion efficiency can be directly calculated from re-ligated 3C libraries against a genomic DNA input control. Note that, due to re-ligation into the original fragment configuration, lower values for digestion will be observed than for a true digestion control.

Step: 49
Problem: Low digestion efficiency
Possible reason: Short digestion period or sub-optimal enzyme activity.
Solution: The total digest time should be 20-24 hours. Additional restriction enzyme can be added at each of the three timepoints (steps 21-23) for cells generating low digestion efficiency.

Problem: Non-exponential amplification
Possible reason: Reaction conditions for primers not optimized to thermocycler.
Solution: Perform a dilution series analysis with genomic DNA and include a melt curve to ensure no primer dimers are being produced.

Step: 61
Problem: DNA not at correct size
Possible reason: Sonicator settings not optimized.
Solution: Each sonicator may vary and should be set accordingly. Settings for sonication should first be determined by testing with high-molecular-weight genomic DNA rather than wasting 3C library. It is important to take into account the mass of DNA being sheared.

Streptavidin beads tend to stick to plastics. We find this effect is minimised by using high-quality, non-sticky tubes from Sorenson BioScience (39640T).

Step: 159
Problem: Loss of DNA after capture
Possible reason: Failed PCR reaction or user error during DNA bead clean-up.
Solution: Captured material is amplified off the beads in four PCR reactions (two per hybridisation reaction). Here, these reactions are performed simultaneously, though it is possible to do these in two batches to protect against error or misfortune and to determine if adequate amplification has occurred.

Step: 172
Problem: Tiled-C matrix not generated
Possible reason: Using coordinates for a single viewpoint, not a region.
Solution: Change the bed file coordinates to match the Tiled-C targeted region, including the start of the first targeted fragment and the end of the last targeted fragment.

Problem: Interaction matrix not generated
Possible reason: Using Capture-C configuration settings.
Solution: Set analysis method in config.xml to either "tiled" for Tiled-C or "tri" for Tri-C.
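
For the Tiled-C entry above, the region to supply runs from the start of the first targeted fragment to the end of the last. A minimal Python sketch follows, assuming the targeted fragments are available as BED-style (chromosome, start, end) tuples. The function name, region name, coordinates and four-column output are hypothetical and are not taken from the CapCruncher documentation.

```python
def tiled_region_bed(fragments, name):
    """Collapse targeted restriction fragments into one BED line.

    fragments: iterable of (chrom, start, end) tuples in BED coordinates.
    Returns a tab-separated line spanning the first fragment's start to
    the last fragment's end.
    """
    fragments = list(fragments)
    if not fragments:
        raise ValueError("no fragments supplied")
    chroms = {chrom for chrom, _, _ in fragments}
    if len(chroms) != 1:
        raise ValueError("a tiled region must lie on a single chromosome")
    start = min(s for _, s, _ in fragments)
    end = max(e for _, _, e in fragments)
    return "\t".join([chroms.pop(), str(start), str(end), name])


# Hypothetical contiguous fragments from a tiled design.
example = tiled_region_bed(
    [("chr11", 1000, 1500), ("chr11", 1500, 2300), ("chr11", 2300, 3100)],
    "tiled_region",
)
```

Supplying this single spanning interval, rather than the coordinates of one viewpoint fragment, corresponds to the fix described in the table.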